Implement insert_many<IntoIterator> for SmallVec #28
You can not take ownership of Maybe it’s better to take an |
005a827 to 3f7606f
Thank you for the feedback! I think I have a much better understanding of why it wasn't working. I intended on adding the insert_slice to leverage memcpy, so it makes a lot of sense to restrict insert_slice to arrays whose types |
I spent a little more time thinking about this and realized that leveraging memcpy is the perfect use of impl specialization. I think I will adapt this diff to implement "insert_many" using an IntoIterator and create a separate PR that uses specialization to get a memcpy-based implementation. https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md
Unfortunately specialization is not stable yet and I’d like to keep this crate working on stable Rust. Beyond that, there was also a very similar discussion in Rust’s standard library, about specializing In the end, specialization was not used and |
3f7606f to 85a4a07
Understood. I've adapted this diff to be completely iterator-compatible (take another look now) without using any memcpy type primitives or specialization. This diff should be safe to add as a backward compatible change in its current state, as long as we're willing to include an "insert_many" operation or something of the sort. I'll create a new diff which will work around the lack of specialization by adding new methods (extend_slice and insert_slice) which take advantage of memcpy semantics for copy type slices.
Is there a specific need for all this? What is it? I’d rather not add lots of API surface just because we can.
We were looking to use this here. There was a spot where we needed to insert many items into a smallvec (preferably efficiently). Currently, we've rolled our own "ElasticArray" with this functionality, but would prefer to use a more supported library like this.
Add benchmarks for push, insert, extend, and pushpop
Want to add these benchmarks so that we can more effectively detect regressions/improvements from PRs like #28 and #29.
85a4a07 to e7246e9
Here are the benchmarks as seen on my machine for this
Looks like insert_many improves upon insert as expected.
This is definitely a micro-optimization sensitive piece of code, so each little change can affect benchmarks. Benchmark tests were invaluable in making sure this didn't regress!
d7175ae to a9fc702
I made some minor tweaks and noticed it changed the benchmark output for bench_insert (even though we didn't touch that function). I disassembled the outputs and have come to the conclusion that different inlining is creating different performance characteristics. On master, the insert() function is inlined into the benchmark, but after my change, the compiler chose to make insert() a function call. This is all pointing to somewhat bogus benchmarking because external callers of insert() shouldn't be seeing this cross-crate inlining. I think in order to get a true sense, the benchmarks need to be in a separate crate, so I will give that a try (maybe using the tests/ directory?).
In any case, I think this iteration of the code is pretty good, but I'd like to improve the benchmarks so we don't see this regression due to inlining.
Going to wait for #32 to land first since it sets up the noinline bench wrapper to get more consistent results.
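To illustrate the noinline wrapper idea mentioned above, here is a minimal sketch. It uses a plain `Vec` rather than `SmallVec`, and the names `insert_noinline` and `bench_insert` are hypothetical, not taken from #32. The `#[inline(never)]` attribute asks the compiler to keep the call out of line, so inlining decisions elsewhere should have less influence on the measurement.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical wrapper (not the one from #32): `#[inline(never)]` asks the
// compiler not to inline this call into the benchmark loop.
#[inline(never)]
fn insert_noinline(v: &mut Vec<u64>, index: usize, value: u64) {
    v.insert(index, value);
}

// Hypothetical timing loop: inserts `iterations` values at the front and
// returns the resulting vector together with the elapsed time.
fn bench_insert(iterations: u64) -> (Vec<u64>, Duration) {
    let mut v = Vec::new();
    let start = Instant::now();
    for i in 0..iterations {
        // `black_box` discourages the optimizer from reasoning about the inputs.
        insert_noinline(black_box(&mut v), 0, black_box(i));
    }
    (v, start.elapsed())
}
```

Moving the benchmarks into a separate crate, as suggested in the comment above, is a complementary approach; this sketch only shows the wrapper.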
☔ The latest upstream changes (presumably #32) made this pull request unmergeable. Please resolve the merge conflicts.
Add benchmark for insert_many
a9fc702 to 15c41c8
Some improvements
I feel good about this iteration for handling the insert-many use case.
Sorry for the delay in reviewing! I wanted to take the time to reason through this implementation carefully, and I'm glad I did!
unsafe {
    let ptr = self.as_mut_ptr().offset(index as isize);
    let old_len = self.len;
    ptr::copy(ptr, ptr.offset(lower_size_bound as isize), old_len - index);
We need to assert index <= old_len or this is unsafe.
}
let num_added = self.len - old_len;
if num_added < lower_size_bound {
    // Iterator provided less elements than the hint
s/less/fewer/
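The approach under review (reserve the size hint's lower bound, shift the tail, fill the gap, then close the gap if the iterator came up short) can be sketched on a plain `Vec`. This is not the PR's code: the free function `insert_many` below is a hypothetical stand-in, it includes the `index <= old_len` assertion requested above, and it hides the tail while the gap is uninitialized so that a panicking iterator leaks elements instead of dropping them twice.

```rust
use std::ptr;

/// Hypothetical free function (not the crate's API): inserts every item
/// yielded by `iter` into `v`, starting at `index`.
fn insert_many<T, I: IntoIterator<Item = T>>(v: &mut Vec<T>, index: usize, iter: I) {
    let mut iter = iter.into_iter();
    let old_len = v.len();
    // Without this check, `old_len - index` below would underflow and the
    // copy would touch memory outside the buffer.
    assert!(index <= old_len, "insertion index out of bounds");

    let (lower, _) = iter.size_hint();
    v.reserve(lower);

    let mut written = 0;
    unsafe {
        let ptr = v.as_mut_ptr().add(index);
        // Open a gap of `lower` slots by shifting the tail (memmove semantics).
        ptr::copy(ptr, ptr.add(lower), old_len - index);
        // Hide the gap and the tail while the gap is uninitialized: if
        // `iter.next()` panics, those elements leak but are never dropped twice.
        v.set_len(index);
        while written < lower {
            match iter.next() {
                Some(item) => {
                    ptr::write(ptr.add(written), item);
                    written += 1;
                }
                None => break,
            }
        }
        if written < lower {
            // The iterator provided fewer elements than the hint: close the gap.
            ptr::copy(ptr.add(lower), ptr.add(written), old_len - index);
        }
        v.set_len(old_len + written);
    }

    // Items beyond the lower bound fall back to one-at-a-time insertion.
    for (offset, item) in iter.enumerate() {
        v.insert(index + written + offset, item);
    }
}
```

The slow path at the end keeps the sketch correct when the hint is too low, at the cost of shifting the tail once per extra item.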
self.reserve(lower_size_bound);

unsafe {
    let ptr = self.as_mut_ptr().offset(index as isize);
All of these casts to isize when offsetting are unsafe, since they could become negative numbers and cause us to write outside of the bounds of the array. We should either use to_isize.unwrap() and panic if that occurs, or use a conversion strategy that yields a value that will give us worse performance but correct behaviour.
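A small sketch of the casting concern, under the assumption that a checked conversion from the standard library is acceptable. It swaps in `isize::try_from` for the `to_isize` call mentioned in the review, and both function names are hypothetical.

```rust
use std::convert::TryFrom;

/// The pattern the review objects to: an `as` cast silently wraps, so any
/// value above `isize::MAX` becomes a negative offset.
fn cast_offset(index: usize) -> isize {
    index as isize
}

/// Hypothetical checked alternative: returns `None` instead of wrapping.
/// A caller that prefers to panic can call `.expect(...)` on the result.
fn checked_offset(index: usize) -> Option<isize> {
    isize::try_from(index).ok()
}
```

With the checked version, an out-of-range index is reported to the caller rather than turned into a write before the start of the buffer.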
Thank you for the careful review! I'm taking another pass at the change with careful attention to integer overflow and casting cases.
At a quick readthrough, it looks like push, pop, truncate, remove, insert all suffer from the same issue if len > isize::MAX. It looks like one might be able to manipulate that situation in a call to insert() when len == isize::MAX (then the len would get set to isize::MAX + 1). In any case, here is an updated revision with some bounds checks to insert_many.
I'm terribly sorry that I dropped the ball here!
📌 Commit 1c254ff has been approved by |
Implement insert_many<IntoIterator> for SmallVec
This doesn't work as is, but I wanted to put it up for feedback. According to these docs: https://doc.rust-lang.org/std/ptr/fn.copy_nonoverlapping.html The copy_nonoverlapping function copies memory as intended, but the ownership of the values copied is not copied over to dest, so when the box is consumed, the SmallVec does not maintain ownership. Not sure how to do this with unsafe code.
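The ownership question in this description can be illustrated with a minimal sketch on two plain `Vec`s. `copy_nonoverlapping` only duplicates bytes, so after the copy both buffers appear to own the same values. One way to hand ownership over is to shrink the source's length so it never drops the moved elements. The function name `move_all` is hypothetical, and the behaviour is essentially what `Vec::append` already provides.

```rust
use std::ptr;

/// Hypothetical helper: moves every element of `src` to the end of `dst`.
fn move_all<T>(src: &mut Vec<T>, dst: &mut Vec<T>) {
    let count = src.len();
    dst.reserve(count);
    unsafe {
        let dst_end = dst.as_mut_ptr().add(dst.len());
        // Bitwise copy: for a moment the values exist in both buffers.
        ptr::copy_nonoverlapping(src.as_ptr(), dst_end, count);
        // Give up ownership on the source side so each value is dropped
        // exactly once, by `dst`.
        src.set_len(0);
        dst.set_len(dst.len() + count);
    }
}
```

If the source were a container that cannot be emptied in place, `std::mem::forget` on it would play the same role, at the cost of leaking its allocation.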
☀️ Test successful - status-travis
Implement extend_from_slice and insert_from_slice with memmove optimization
Implement the performance optimized versions of insert_many and extend (from #28). These methods use memmove rather than a looped insert. If we had function specialization, we could implement these without exposing new methods. Up to the maintainers whether we want to support these new methods.
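As a rough illustration of the memmove-based idea described here, the following sketch inserts a slice of `Copy` values into a plain `Vec` with one tail shift and one bulk copy. It is not the code from #29; the free function `insert_from_slice` merely borrows the name for illustration.

```rust
use std::ptr;

/// Hypothetical free function: inserts `slice` into `v` at `index` using a
/// single tail shift followed by a single bulk copy.
fn insert_from_slice<T: Copy>(v: &mut Vec<T>, index: usize, slice: &[T]) {
    let old_len = v.len();
    assert!(index <= old_len, "insertion index out of bounds");
    v.reserve(slice.len());
    unsafe {
        let ptr = v.as_mut_ptr().add(index);
        // memmove: source and destination ranges may overlap.
        ptr::copy(ptr, ptr.add(slice.len()), old_len - index);
        // memcpy: `slice` cannot alias `v`'s buffer because `v` is borrowed
        // mutably, and `T: Copy` makes a bitwise copy a valid duplicate.
        ptr::copy_nonoverlapping(slice.as_ptr(), ptr, slice.len());
        v.set_len(old_len + slice.len());
    }
}
```

The `T: Copy` bound is what lets this avoid specialization: the method is simply unavailable for types where a bitwise copy would be wrong.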